IMPLEMENTING THE ASYMPTOTICALLY FAST VERSION OF THE 
ELLIPTIC CURVE PRIMALITY PROVING ALGORITHM 



F. MORAIN 

Abstract. The elliptic curve primality proving (ECPP) algorithm is one of the fastest practical
algorithms currently known for proving the primality of large numbers. Its running time cannot be
proven rigorously, but heuristic arguments show that it should run in time $O((\log N)^5)$ to prove the
primality of $N$. An asymptotically fast version of it, attributed to J. O. Shallit, runs in time
$O((\log N)^4)$. The aim of this article is to describe this version in more detail, leading to actual
implementations able to handle numbers with several thousand decimal digits.



1. Introduction 

From the work of Agrawal, Kayal and Saxena [2], we know that determining the primality of
an integer $N$ can be done in proven deterministic polynomial time $O((\log N)^{10.5})$. More recently,
H.-W. Lenstra and C. Pomerance have announced a version in $O((\log N)^6)$. Building on the work
of P. Berrizbeitia 1 7 ], D. Bernstein jS] and P. Mihailescu & R. Mocenigo 3H, independently, have
given improved probabilistic versions with a claim of proven complexity of $O((\log N)^4)$, reusing
classical cyclotomic ideas that originated in the Jacobi sums test. For more on primality
before AKS, we refer the reader to [T3|. For the recent developments, see jS].

All the known versions of the AKS algorithm are for the time being too slow to prove the primality
of large explicit numbers. On the other hand, the elliptic curve primality proving algorithm [3]
has been used for years to prove the primality of ever larger numbers. The algorithm has a
heuristic running time of $O((\log N)^5)$. In the course of writing [33], the author rediscovered the
article [28], in which an asymptotically fast version of ECPP is described. This version, attributed
to J. O. Shallit, has a heuristic running time of $O((\log N)^4)$. The aim of this paper is to describe
fastECPP, give a heuristic analysis of it and describe its implementation.

Section 2 collects some well-known facts on imaginary quadratic fields, which can be found for
instance in ^3j. Section 3 presents the basic ECPP algorithm and analyzes it. In Section 4, the
fast version is described and its complexity estimated. Section 5 explains the implementation and 
Section 6 gives some actual timings on large numbers. 

2. Quadratic fields 

A discriminant $-D < 0$ is said to be fundamental if and only if $D$ is free of odd square prime
factors, and moreover $D \equiv 3 \bmod 4$ or, when $4 \mid D$, $(D/4) \bmod 4 \in \{1, 2\}$. The quantity

$$\mathcal{D}(X) = \#\{D < X,\ -D \text{ is fundamental}\}$$

is easily seen to be $O(X)$.

A fundamental discriminant may be written as:

$$-D = \prod_{i=1}^{t} q_i^*$$

where all $q_i^*$'s are distinct and $q^*$ is either $-4$ or $\pm 8$, or $q^* = \left(\frac{-1}{q}\right) q$ for any odd prime $q$. The
number of genera is $g(-D) = 2^{t-1}$ and Gauss proved that this number divides the class number
$h(-D)$ of $K = \mathbb{Q}(\sqrt{-D})$. Moreover, Siegel proved that $h = O(D^{1/2+\varepsilon})$ asymptotically.
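The definition of a fundamental discriminant translates directly into a test on $D$. The following Python sketch is only an illustration: the names `is_fundamental` and `count_fundamental` are ours, not the paper's, and plain trial division is assumed to be good enough for the small values of $D$ involved.

```python
def is_fundamental(D):
    """Return True if -D is a fundamental discriminant (D > 0)."""
    if D <= 0:
        return False
    if D % 4 == 3:
        odd_part = D
    elif D % 4 == 0 and (D // 4) % 4 in (1, 2):
        odd_part = D // 4
        if odd_part % 2 == 0:      # D/4 = 2 mod 4: remove the single factor 2
            odd_part //= 2
    else:
        return False
    # D must be free of odd square prime factors.
    p = 3
    while p * p <= odd_part:
        if odd_part % (p * p) == 0:
            return False
        p += 2
    return True


def count_fundamental(X):
    """The quantity #{D < X, -D fundamental}."""
    return sum(1 for D in range(1, X) if is_fundamental(D))


print([D for D in range(1, 25) if is_fundamental(D)])
# [3, 4, 7, 8, 11, 15, 19, 20, 23, 24]
```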

The rational prime $p$ is the norm of an integer in $K$, or equivalently, $4p = U^2 + DV^2$ in rational
integers $U$ and $V$, if and only if the ideal $(p)$ splits completely in the Hilbert class field of $K$, denoted
$K_H$, an extension of degree $h(-D)$ of $K$. The probability that a prime $p$ splits completely in $K_H$ is
$1/(2h(-D))$.

Using Gauss's theory of genera of forms, it is known that if $\left(\frac{q_i^*}{p}\right) = 1$ for all $i$ (equivalently, $(p)$
splits in the genus field of $K$), then the probability of $(p)$ splitting in $K_H$ is $g(-D)/h(-D)$.

3. The basic ECPP algorithm 

We present a rough sketch of the ECPP algorithm, enough for us to estimate its complexity. We 
do not insist on what happens if one of the steps fails, revealing the compositeness of N. More 
details can be found in [2]. 

3.1. Elliptic curves over Z/NZ. For us, an elliptic curve $E$ modulo $N$ will have an equation
$Y^2 = X^3 + aX + b$ with $\gcd(4a^3 + 27b^2, N) = 1$ and we will use the set of points $E(\mathbb{Z}/N\mathbb{Z})$ defined
as:

$$E(\mathbb{Z}/N\mathbb{Z}) = \{(x : y : z) \in \mathbb{P}^2(\mathbb{Z}/N\mathbb{Z}),\ y^2 z = x^3 + a x z^2 + b z^3\} \cup \{O_E = (0 : 1 : 0)\}$$

which resembles the definition of an actual elliptic curve if $N$ is prime, $\mathbb{P}^2(\mathbb{Z}/N\mathbb{Z})$ being the
projective plane over $\mathbb{Z}/N\mathbb{Z}$. The important point here is that if $p$ is a divisor of $N$, we can reduce
the curve $E$ and a point $P$ on it via a reduction modulo $p$ of each integer, yielding a point $P_p$ on $E_p$.
Moreover, we can define an operation on $E(\mathbb{Z}/N\mathbb{Z})$, called pseudo-addition, that adds two "points"
$P$ and $Q$ with the usual chord-and-tangent law. This operation either yields a point $R$ or a divisor
of $N$ if any is encountered when dividing. If $R$ exists, then it has the property that $R_p$ is the sum
of $P_p$ and $Q_p$ on $E_p$ for all prime factors $p$ of $N$. Note also that $O_E$ reduces to the ordinary point
at infinity on $E_p$.
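Pseudo-addition can be sketched in a few lines of Python. This is a minimal illustration in affine coordinates, not the author's implementation: the names `FactorFound`, `pseudo_add` and `pseudo_mul` are hypothetical, the point at infinity is represented by `None`, and coordinates are assumed to be already reduced modulo $N$.

```python
from math import gcd


class FactorFound(Exception):
    """Raised when a division fails and exposes a divisor of N."""

    def __init__(self, divisor):
        super().__init__(divisor)
        self.divisor = divisor


def pseudo_add(P, Q, a, N):
    """Chord-and-tangent sum on Y^2 = X^3 + aX + b modulo N (None is O_E)."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2:
        if (y1 + y2) % N == 0:
            return None                         # P + (-P) = O_E
        if y1 != y2:
            # y1^2 = y2^2 but y1 != +-y2: only possible for composite N.
            raise FactorFound(gcd(y1 - y2, N))
        num, den = 3 * x1 * x1 + a, 2 * y1      # tangent
    else:
        num, den = y2 - y1, x2 - x1             # chord
    g = gcd(den % N, N)
    if g != 1:
        raise FactorFound(g)
    lam = num * pow(den, -1, N) % N
    x3 = (lam * lam - x1 - x2) % N
    y3 = (lam * (x1 - x3) - y1) % N
    return (x3, y3)


def pseudo_mul(m, P, a, N):
    """[m]P by the binary method, using pseudo-additions only."""
    R = None
    while m > 0:
        if m & 1:
            R = pseudo_add(R, P, a, N)
        m >>= 1
        if m:
            P = pseudo_add(P, P, a, N)
    return R
```

For instance, on $Y^2 = X^3 + 2X + 2$ modulo 17, which has 19 points, `pseudo_mul(19, (5, 1), 2, 17)` returns `None`, whereas adding two points whose abscissas differ by a multiple of a prime factor of a composite $N$ raises `FactorFound`.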

We will need to exponentiate points in $E$. This is best defined using the division polynomials
(see for instance [I] for a lot of properties on these). Remember that over a field $K$ there exist
polynomials $\phi_m(X, Y)$, $\psi_m(X, Y)$, $\omega_m(X, Y)$ such that

$$[m]P = \underbrace{P + \cdots + P}_{m \text{ times}} = [m](X, Y) = \left(\phi_m(X, Y)\,\psi_m(X, Y) : \omega_m(X, Y) : \psi_m(X, Y)^3\right). \tag{1}$$

All these polynomials can be computed via recurrence formulas and there is an $O(\log m)$ algorithm
for this task (a variant of the usual binary method for exponentiating).

We will take (1) as the definition of $[m]P$ over $\mathbb{Z}/N\mathbb{Z}$. We note here that if $\psi_m(X, Y) = 0$, then
$[m]P$ is equivalent to the point $O_E$.

For the sake of presenting the algorithm in a simplified setting, we prove (compare [25]):

Proposition 3.1. Let $N'$ be a prime satisfying $(\sqrt{N} - 1)^2 < 2N' < (\sqrt{N} + 1)^2$. Suppose that
$E(\mathbb{Z}/N\mathbb{Z})$ is a curve over $\mathbb{Z}/N\mathbb{Z}$, that $P = (x : y : 1)$ is such that $\gcd(y, N) = 1$, $\psi_{2N'}(x, y) = 0$
but $\gcd(\psi_{N'}(x, y), N) = 1$. Then $N$ is prime.

Proof: suppose that $N$ is composite and that $p \le \sqrt{N}$ is one of its prime factors. Let us look
at what happens modulo $p$. By construction, $P_p$ is not a 2-torsion point on $E_p$. Since $\psi_{N'}(x, y)$
is invertible modulo $p$, we have $[N'](P_p) \ne O_{E_p}$ and therefore $P_p$ is of order $2N'$ modulo $p$. This is
impossible, since $2N' > (\sqrt{N} - 1)^2 \ge (p - 1)^2 \ge (\sqrt{p} + 1)^2 \ge \#E_p$ by Hasse's theorem. □

3.2. Presentation of the algorithm. We want to prove that $N$ is prime. The algorithm runs as
follows:



[Step 1.] Repeat the following: Find an imaginary quadratic field $K = \mathbb{Q}(\sqrt{-D})$ of discriminant
$-D$, $D > 0$, such that

$$4N = U^2 + DV^2 \tag{2}$$

in rational integers $U$ and $V$. For all solutions $U$ of (2), compute $m = N + 1 - U$; if one of these
numbers is twice a probable prime $N'$, go to Step 2.

[Step 2.] Build an elliptic curve $E$ over $\mathbb{Q}$ having complex multiplication by the ring of integers $\mathcal{O}_K$
of $K$.

[Step 3.] Reduce $E$ modulo $N$ to get a curve $E$.

[Step 4.] Find $P = (x : y : 1)$, $\gcd(y, N) = 1$ on $E$ such that $\psi_{2N'}(x, y) = 0$, but $\gcd(\psi_{N'}(x, y), N) = 1$.
If this cannot be done, then $N$ is composite; otherwise, it is prime by Proposition 3.1.

[Step 5.] Set $N = N'$ and go back to Step 1.

3.3. Analyzing ECPP. We will now analyze all steps of the above algorithm and give complexity
estimates using the parameter $L = \log N$. One basic unit of time will be the time needed to multiply
two integers of size $L$, namely $O(L^{1+\mu})$, where $0 \le \mu \le 1$ ($\mu = 1$ for ordinary multiplication, or
any $\varepsilon > 0$ for a fast multiplication method).

Clearly, we need $O(\log N)$ steps for proving the primality of $N$. We consider all steps, one at a time,
easier steps first.

3.4. Analysis of Step 4. Finding a point $P$ can be done by a simple algorithm that looks for the
smallest $x$ such that $x^3 + ax + b$ is a square modulo $N$ and then extracts a squareroot modulo
$N$, for a cost of $O((\log N)^{2+\mu})$. Note that we can do without this with the trick described in
[3, §8.6.3], though we do not need this at this point.

Computing $\psi_{N'}(x, y)$ costs $O((\log N)^{2+\mu})$, and we need $O(1)$ points on average, so this step
amounts to $O((\log N)^{2+\mu})$.

3.5. Analyzing Step 2. The original version is to realize $K_H/K$ via the computation of the
minimal polynomial $H_D(X)$ of the special values of the classical $j$-invariant at quadratic integers.
More precisely, we can view the class group $Cl(-D)$ of $K$ as a set of primitive reduced quadratic
forms of discriminant $-D$. If $(A, B, C)$ is such a form, with $B^2 - 4AC = -D$, then

$$H_D(X) = \prod_{(A,B,C) \in Cl(-D)} \left(X - j\left(\frac{-B + \sqrt{-D}}{2A}\right)\right).$$

In ^S], it is argued that the height of this polynomial is well approximated by the quantity: 

[A,B,C]eCl(-D) 

which can be shown to be $O((\log h)^2)$.

Evaluating the roots of $H_D(X)$ and building this polynomial can be done in $O(h^2)$ operations
(see JHl). Note that this step does not require computations modulo $N$.

Alternatively, we could use the method of |12[ |Hj for computing the class polynomial and get a
proven running time of $O(h^2)$, but assuming GRH.

3.6. Analyzing Step 3. Reducing $E$ modulo $N$ is done by finding a root of $H_D(X)$ modulo $N$.
This can be done with the Cantor-Zassenhaus algorithm (see [22] for instance). Briefly, we split
recursively $H_D(X)$ by computing $\gcd((X + a)^{(N-1)/2} - 1, H_D(X)) \bmod N$ for random $a$'s.

Computing $(X + a)^{(N-1)/2} \bmod (N, H_D(X))$ costs $O((\log N)\,M(N, h)) = O(L\,M(N, h))$ where
$M(N, d)$ is the time needed to multiply two degree $d$ polynomials modulo $N$. A gcd of two degree
$d$ polynomials costs $M(N, d) \log d$ (see [22, Ch. 11]). The total splitting requires $\log h$ steps, but
the overall cost is dominated by the first one, hence yields a time:

$$O(M(N, h) \max(L, \log h)).$$

We can assume that $M(N, d) = O(d^{1+\nu} L^{1+\mu})$ where again $0 \le \nu \le 1$.
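The splitting described in this subsection can be illustrated with schoolbook polynomial arithmetic. The sketch below is ours and makes several assumptions: $N$ is an odd prime, the input polynomial is squarefree and splits into linear factors modulo $N$, polynomials are coefficient lists with the constant term first, and all function names are hypothetical. It uses quadratic-time multiplication, so it shows the method rather than the cost $M(N, d)$ discussed above.

```python
import random


def trim(f):
    """Drop zero leading coefficients (lists are low degree first)."""
    while f and f[-1] == 0:
        f.pop()
    return f


def poly_rem(f, g, N):
    """Remainder of f modulo g over Z/NZ; g is nonzero with trimmed leading term."""
    f = trim([c % N for c in f])
    inv = pow(g[-1], -1, N)
    while len(f) >= len(g):
        c = f[-1] * inv % N
        shift = len(f) - len(g)
        for i, gi in enumerate(g):
            f[shift + i] = (f[shift + i] - c * gi) % N
        trim(f)
    return f


def poly_mulmod(f, g, h, N):
    """f * g modulo (N, h), schoolbook multiplication."""
    prod = [0] * max(len(f) + len(g) - 1, 0)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            prod[i + j] = (prod[i + j] + fi * gj) % N
    return poly_rem(prod, h, N)


def poly_powmod(f, e, h, N):
    """f^e modulo (N, h) by the binary method."""
    result, f = [1], poly_rem(f, h, N)
    while e:
        if e & 1:
            result = poly_mulmod(result, f, h, N)
        f = poly_mulmod(f, f, h, N)
        e >>= 1
    return result


def poly_gcd(f, g, N):
    """A (not necessarily monic) gcd of f and g over Z/NZ."""
    f, g = trim([c % N for c in f]), trim([c % N for c in g])
    while g:
        f, g = g, poly_rem(f, g, N)
    return f


def find_root(h, N):
    """One root of h modulo N via gcd((X + a)^((N-1)/2) - 1, h) for random a."""
    h = trim([c % N for c in h])
    while len(h) > 2:
        a = random.randrange(N)
        t = poly_powmod([a, 1], (N - 1) // 2, h, N) or [0]
        t[0] = (t[0] - 1) % N
        d = poly_gcd(t, h, N)
        if 1 < len(d) < len(h):       # proper factor found: keep splitting it
            h = d
    return -h[0] * pow(h[1], -1, N) % N
```

With $h = (X - 2)(X - 3)(X - 5)$ and $N = 101$, `find_root([-30, 31, -10, 1], 101)` returns one of 2, 3 or 5, depending on the random choices.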

3.7. Analysis of Step 1. This is the crucial step that will give us the clue to the complexity. Given
$D$, testing whether (2) is satisfied involves the reduction of the ideal $\left(N, \frac{r - \sqrt{-D}}{2}\right)$ that lies above $(N)$
in $K$, where $r^2 \equiv -D \bmod (4N)$ (if $N$ is prime...). This requires the computation of $\sqrt{-D} \bmod N$,
using for instance the Tonelli-Shanks algorithm, for the cost of one modular exponentiation, i.e.,
$O(L^{2+\mu})$ time. Then it proceeds with a half gcd like computation, for a cost of $O(L^{1+\mu})$ (see also
Section 5.2 below).
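For reference, here is a compact Python version of the Tonelli-Shanks algorithm mentioned above. It is a textbook sketch under the assumption that the modulus is an odd prime (which is only hoped for when the modulus is $N$); the name `sqrt_mod_prime` is ours.

```python
def sqrt_mod_prime(n, p):
    """Return r with r*r = n (mod p), p an odd prime, or None if n is a non-residue."""
    n %= p
    if n == 0:
        return 0
    if pow(n, (p - 1) // 2, p) != 1:      # Euler's criterion
        return None
    # Write p - 1 = q * 2^s with q odd.
    q, s = p - 1, 0
    while q % 2 == 0:
        q //= 2
        s += 1
    # Find a quadratic non-residue z.
    z = 2
    while pow(z, (p - 1) // 2, p) != p - 1:
        z += 1
    m, c = s, pow(z, q, p)
    t, r = pow(n, q, p), pow(n, (q + 1) // 2, p)
    while t != 1:
        # Least i > 0 with t^(2^i) = 1.
        i, t2 = 0, t
        while t2 != 1:
            t2 = t2 * t2 % p
            i += 1
        b = pow(c, 1 << (m - i - 1), p)
        m, c = i, b * b % p
        t, r = t * c % p, r * b % p
    return r
```

A call such as `sqrt_mod_prime(-D, N)` then gives the squareroot needed before the reduction step; the dominant cost is the modular exponentiation, in line with the $O(L^{2+\mu})$ estimate.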

In the event that equation (2) is solvable, we need to check that $m = 2N'$ and test $N'$ for
primality, which costs again some $O(L^{2+\mu})$.

The heuristic probability of $m$ being of the given form is $O(1/L)$. Though quite realistic, it
is impossible to prove, given the current state of the art in analytic number theory. Using this
heuristic, we expect to need $O(L)$ splitting $D$'s. Let us take all discriminants less than $D_{\max}$.
They have class number close to $h(-D_{\max}) = O(\sqrt{D_{\max}})$ and there are $O(D_{\max})$ of them. We see
that if $L = O(D_{\max})/\sqrt{D_{\max}}$, then among these discriminants, one will lead to a useful $m$. We
conclude that $D_{\max} = O(L^2)$ should suffice.

Turning to complexity, the cost of Step 1 is then that of solving (2) $O(L^2)$ times, followed by $O(L)$
probable primality tests:

$$O\big(L^2 (\underbrace{L^{2+\mu}}_{\sqrt{-D} \bmod N} + \underbrace{L^{1+\mu}}_{\text{reduction}})\big) + O(\underbrace{L \cdot L^{2+\mu}}_{\text{probable primality}})$$

which is dominated by the first cost, namely $O(L^{4+\mu})$.

3.8. Adding everything together. Taking $D = O(L^2)$ readily implies $h = O(L)$, so that the
cost of Step 2 is $O(L^2)$, and that of Step 3 is $O((\log L) L^{3+\mu+\nu})$, which dominates Step 4. All in
all, we get that ECPP has heuristic complexity $O(L^{4+\mu})$ for one step, and therefore $O(L^{5+\mu})$ in
total.

3.9. Remark. In practice, the dominant term of the complexity of Step 1 is $O(n_D L^{2+\mu})$ where $n_D$
is the number of $D$'s for which we try to solve equation (2). Depending on implementation parameters
and the real size of $N$, this number $n_D$ can be quite small. This gives a very small apparent complexity
to ECPP, somewhere in between $L^3$ and $L^4$, and explains why ECPP seems so fast in practice (see
for instance [2T]).

4. The fast version of ECPP 

4.1. Presentation. When dealing with large numbers, all the time is spent in the finding of $D$,
which means that a lot of squareroots modulo $N$ must be computed. A first way to reduce the
computations, alluded to in [3, §8.4.3], is to accumulate squareroots, and reuse them, at the cost
of some multiplications. For instance, if one has $\sqrt{-3}$ and $\sqrt{5} = \sqrt{-20}/\sqrt{-4}$, then we can build
$\sqrt{-15}$, etc.

A better way that leads to the fast version consists in computing a basis of small squareroots and
building discriminants from this basis. Looking at the analysis carried out above, we see that we need
$O(L^2)$ discriminants to find a good one. The basic version finds them by using all discriminants
that are of size $O(L^2)$. As opposed to this, one can build those discriminants as $-D = (-p)(q)$,
where $p$ and $q$ are taken from a pool of $O(L)$ primes.

More formally, we replace Step 1 by Step 1' as follows:

[Step 1'.]

1.1. Find the $r = O(L)$ smallest primes $q^*$ such that $\left(\frac{q^*}{N}\right) = 1$, yielding $Q = \{q_1^*, q_2^*, \ldots, q_r^*\}$.

1.2. Compute all $\sqrt{q^*} \bmod N$ for $q^* \in Q$.

1.3. For all pairs $(q_{i_1}^*, q_{i_2}^*)$ of $Q$ for which $q_{i_1}^* q_{i_2}^* = -D < 0$, try to solve equation (2).
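Steps 1.1 to 1.3 can be sketched as follows. This is an illustration under simplifying assumptions that are ours, not the paper's: only the odd values $q^* = \pm q$ are used (so $-4$ and $\pm 8$ are left out), $N$ is assumed to be a prime with $N \equiv 3 \bmod 4$ so that a squareroot is a single exponentiation, and the function names are hypothetical. The point is that each $\sqrt{-D}$ costs one multiplication once the basis is known.

```python
from itertools import combinations


def jacobi(a, n):
    """Jacobi symbol (a/n) for odd n > 0."""
    a %= n
    result = 1
    while a:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def signed_primes():
    """Yield q* = (-1/q) q for the odd primes q: -3, 5, -7, -11, 13, ..."""
    q = 3
    while True:
        if all(q % d for d in range(3, int(q ** 0.5) + 1, 2)):
            yield q if q % 4 == 1 else -q
        q += 2


def sqrt_basis(N, r):
    """Steps 1.1 and 1.2: the first r values q* with (q*/N) = 1 and their roots.

    Assumes N is prime and N = 3 (mod 4), so sqrt(x) = x^((N+1)/4) mod N.
    """
    basis = {}
    for qs in signed_primes():
        if len(basis) == r:
            break
        if jacobi(qs, N) == 1:
            basis[qs] = pow(qs, (N + 1) // 4, N)
    return basis


def discriminant_roots(basis, N):
    """Step 1.3 preparation: D -> sqrt(-D) mod N for all pairs with product -D < 0."""
    roots = {}
    for q1, q2 in combinations(basis, 2):
        if q1 * q2 < 0:
            roots[-q1 * q2] = basis[q1] * basis[q2] % N
    return roots
```

For $N = 103$ and $r = 4$ the basis is $\{-3, -11, 13, 17\}$, which yields $D \in \{39, 51, 143, 187\}$ without any further squareroot computation.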

The cost of this new Step 1 is that of computing $r = O(L)$ squareroots modulo $N$, for a cost of
$O(L \cdot L^{2+\mu})$. Then, we still have $O(L^2)$ reductions. The new overall cost of this phase decreases
now to:

$$O(\underbrace{L \cdot L^{2+\mu}}_{\text{squareroots}}) + O(\underbrace{L^2 \cdot L^{1+\mu}}_{\text{reduction}}) + O(\underbrace{L \cdot L^{2+\mu}}_{\text{probable primality}})$$

which yields $O(L^{3+\mu})$. Note here how the complexity decomposes as $3 = 1 + 2$ or $2 + 1$
depending on the sub-algorithms.

Putting everything together, we end up with a total cost of $O(L^{4+\mu})$ for this variant of ECPP.

4.2. Remarks. 

4.2.1. Complexity issues. We can slightly optimize the preceding argument, by using all subsets of
$Q$ and not only pairs of elements. This would call for $r = O(\log \log N)$, since then $2^r = L^2$ could
be reached. Though useful in practice, this phase no longer dominates the cost of the algorithm.

Moreover, we can see that several phases of fastECPP have cost $O(L^3)$, which means that we
would have to fight hard to decrease the overall complexity below $O(L^4)$.

4.2.2. A note on discriminants. Note that we use fundamental discriminants only, as non-fundamental
discriminants lead to curves that do not bring anything new compared to fundamental ones.
Indeed, if $\mathcal{D} = f^2 D$, with $D$ fundamental, then there is a curve having CM by the order of
discriminant $\mathcal{D}$. Writing $4N = U^2 + D f^2 V^2$, its cardinality is $N + 1 - U$, the same as the corresponding
curve associated to $D$.

4.2.3. A note on class numbers. As soon as we use composite discriminants $-D$ of the form $q_{i_1}^* q_{i_2}^*$,
Gauss's theorem tells us that the class number $h(-D)$ is even. This could bias our estimation, but
we conjecture that the effect is not important.

5. Implementation 

5.1. Computing class numbers. In order to make the search for $D \in \mathcal{D}$ efficient, it is better to
control the class number beforehand. Tables can be made, but for larger computations, we need
a fast way to compute $h(-D)$. Subexponential methods exist, assuming the Generalized Riemann
Hypothesis. From a practical point of view, our $D$'s are of medium size. Enumerating all forms
costs $O(h^2)$ with a small constant, and Shanks's baby-steps/giant-steps algorithm costs $O(\sqrt{h})$ but
with a large constant. It is better here to use the explicit formula of Louboutin [29] that yields a
practical method in $O(h)$ with a very small constant.

5.2. An improved Cornacchia algorithm. Step 1 needs squareroots to be computed, and some
half gcd to be performed. Briefly, Cornacchia's algorithm runs as follows (see [34]):

procedure CORNACCHIA(d, p, t)

{$t$ is such that $t^2 \equiv -d \bmod p$, $p/2 < t < p$}

a) $r_{-2} = p$, $r_{-1} = t$; $w_{-2} = 0$, $w_{-1} = 1$;

b) for $i \ge 0$ while $r_{i-1} > \sqrt{p}$ do

    $r_{i-2} = a_i r_{i-1} + r_i$, $0 \le r_i < r_{i-1}$;

    $w_i = w_{i-2} + a_i w_{i-1}$ (*);

c) if $r_{i-1}^2 + d\,w_{i-1}^2 = p$ then return $(r_{i-1}, w_{i-1})$ else return $\emptyset$.

We end the for loop once we get $r_{i-1} < \sqrt{p} < r_{i-2}$. As is well known, the $a_i$'s are quite small
and we can guess their size by monitoring the number of bits of the $r_i$'s, thus limiting the number
of long divisions. One can use a fast variant for this half gcd if needed, in a way reminiscent of
Knuth.

Moreover, from the theory, we know that this algorithm almost always returns the empty
set in step c), since the probability of success is $1/(2h(-d))$. Therefore, when $h$ is large, we can
dispense with the multiprecision computations in equation (*). We replace it by single precision
computations:

$$w_i = w_{i-2} + a_i w_{i-1} \bmod 2^{32}$$

and at the end, we test whether $r_{i-1}^2 + d\,w_{i-1}^2 \equiv p \bmod 2^{32}$. If this is the case, then we redo the
computation of the $w_i$'s and check again.
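The procedure and its single precision variant can be transcribed into Python as below. This is a sketch, not the author's code: the names `cornacchia` and `cornacchia_filtered` are ours, Python integers are arbitrary precision so the masking only mimics the 32-bit trick, and no attempt is made to limit long divisions.

```python
from math import isqrt

MASK = (1 << 32) - 1


def cornacchia(d, p, t):
    """Try to solve x^2 + d*y^2 = p, given t with t^2 = -d (mod p), p/2 < t < p."""
    r_prev, r = p, t          # r_{i-2}, r_{i-1}
    w_prev, w = 0, 1          # w_{i-2}, w_{i-1}
    bound = isqrt(p)          # for integers, r > sqrt(p) iff r > isqrt(p)
    while r > bound:
        a, r_next = divmod(r_prev, r)
        r_prev, r = r, r_next
        w_prev, w = w, w_prev + a * w        # equation (*)
    if r * r + d * w * w == p:
        return r, w
    return None


def cornacchia_filtered(d, p, t):
    """Same answer, but w is first tracked modulo 2^32 only."""
    r_prev, r = p, t
    w_prev, w = 0, 1
    bound = isqrt(p)
    while r > bound:
        a, r_next = divmod(r_prev, r)
        r_prev, r = r, r_next
        w_prev, w = w, (w_prev + a * w) & MASK
    if (r * r + d * w * w - p) & MASK != 0:
        return None                          # the common case: no solution
    return cornacchia(d, p, t)               # rare: redo with exact w's
```

For example, `cornacchia(5, 29, 16)` returns `(3, 2)` since $29 = 3^2 + 5 \cdot 2^2$, while `cornacchia(5, 7, 4)` returns `None`: $-5$ is a square modulo 7, but 7 is not of the form $x^2 + 5y^2$.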

5.3. Factoring m. Critical parameters are those related to the factorization of $m$, since in practice
we try to factor $m$ to get it of the form $cN'$ for some $B$-smooth number $c$.

As shown in j^Hl, the number of probable prime tests we will have to perform is $t = O((\log N)/(\log B))$
and we will end up with $N'$ such that $N/N' \approx \log B$.

For small numbers, we can factor lots of $m$ doing the following. In a first step, we compute

$$r_i = (N + 1) \bmod p_i$$

for all $p_i \le B$, which costs $\pi(B)(\log N)^{1+\varepsilon}$, where $\pi(B) = O(B/\log B)$ is the number of primes
below $B$, the other term being the time needed to divide a multi-digit number by a single digit
number.

Then, sieving both $m = N + 1 - U$ and $m' = N + 1 + U$ is done by computing $u_i = U \bmod p_i$
and comparing it to $\pm r_i$, for primes $p_i$ such that $(-D/p_i) \ne -1$. See E2] for more details and
tricks.
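The comparison of $u_i$ with $\pm r_i$ can be illustrated in Python. The sketch is ours (the name `sieve_m` is hypothetical); it only reports which small primes divide $m$ and $m'$, and it omits the restriction to primes with $(-D/p_i) \ne -1$ for brevity.

```python
def sieve_m(N, U, primes):
    """Small primes dividing m = N + 1 - U and m' = N + 1 + U."""
    r = {p: (N + 1) % p for p in primes}     # computed once for a given N
    u = {p: U % p for p in primes}           # cheaper: U has about half the size of N
    divides_m = [p for p in primes if u[p] == r[p]]              # r_i - u_i = 0
    divides_m_prime = [p for p in primes if u[p] == (-r[p]) % p]  # r_i + u_i = 0
    return divides_m, divides_m_prime


small_primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
print(sieve_m(101, 14, small_primes))
# ([2, 11], [2, 29])   since m = 88 = 2^3 * 11 and m' = 116 = 2^2 * 29
```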

The cost of this algorithm for $t$ values of $m$ is

$$O(\pi(B)(\log N)^{1+\varepsilon}) + O(t \cdot \pi(B)(\log N)^{1+\varepsilon})$$

where the second term is that for computing $U \bmod p_i$, which is roughly half that of $(N + 1) \bmod p_i$,
since $U$ is $O(\sqrt{N})$. Since we also need to perform $t$ probable prime tests (say, a plain Fermat one),
the cost is

$$O(tBL) + O(tL^{2+\mu}) = O(BL^2) + O(L^{3+\mu})$$

and therefore the optimal value for $B$ is $B = O(L)$.

For larger numbers, it is better to use the stripping factor algorithm in [20], for a cost of
$O(B(\log B)^2)$, the optimal value of $B$ being $B = O((\log N)^3)$.

5.3.1. Remark. Suppose now that we have found $N'$ and that $m = N + 1 - U = cN'$. Then we will
have to compute

$$r'_i = (N' + 1) \bmod p_i$$

which may be computed as:

$$r'_i = (r_i - u_i)/c + 1 \bmod p_i.$$

Computing the right hand side is faster, since $c$ is ordinarily small compared to $N'$.
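The identity follows from $N' = (N + 1 - U)/c$, and a short Python check makes it concrete. The helper name `next_residues` is ours, and we assume that primes dividing $c$ are simply skipped, since $c$ is not invertible modulo them.

```python
def next_residues(r, u, c, primes):
    """r'_i = (N' + 1) mod p_i from r_i = (N + 1) mod p_i and u_i = U mod p_i.

    Uses N + 1 - U = c * N'; primes dividing c are skipped.
    """
    return {p: ((r[p] - u[p]) * pow(c, -1, p) + 1) % p
            for p in primes if c % p}


# Toy example: N = 101, U = 14, m = 88 = 8 * 11, so c = 8 and N' = 11.
primes = [3, 5, 7, 11, 13]
r = {p: (101 + 1) % p for p in primes}
u = {p: 14 % p for p in primes}
print(next_residues(r, u, 8, primes) == {p: (11 + 1) % p for p in primes})
# True
```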

5.3.2. Using an early abort strategy. This idea is presented in [20]. We would like to go down as
fast as possible. So why not require $N/N'$ to be greater than some given bound? Candidates $N'$ need be
tested for probable primality only if this bound is met. From what has been written above, we can
insist on $N/N' \approx \log B$. In practice, we used a bound $\delta$ and required $N/N' > 2^{\delta}$.

5.3.3. Using new invariants. Proving larger and larger numbers forces us to use larger and larger
$D$'s, leading to larger and larger polynomials $H_D$. For this to be doable, new invariants had to be
used, so as to minimize the size of the minimal polynomials. This task was done using Schertz's
formulation of Shimura's reciprocity law with the invariants of ^Sj as demonstrated in
(alternatively see [2U 123]). Note that replacing $j$ by other functions does not change the
complexity of the algorithm, though it is crucial in practice.

5.3.4. Step 3 in practice. We already noted that this step is the theoretically dominating one in
fastECPP, with a cost of $O((\log L) L^{3+\mu+\nu})$. In practice, even for small values of $h$, we can assume
$\nu \approx 0$ (using for instance the algorithm of [30] for polynomial multiplication).

Galois theory comes in handy for reducing the $\log L$ term to a $\log \log L$ one, if we insist on $h$ being
smooth. Then, we replace the time needed to factor a degree $h$ polynomial by that of factoring a list
of smaller ones, the largest prime factor of $h$ being $\log h$. We already used that in ECPP, using [23 El. Typical
values of $h$ are now routinely in the 10000 zone.

It could be argued that keeping only smooth class numbers is too restrictive. Note however, that
class numbers tend to be smoother than ordinary numbers |l()j.

5.3.5. Improving the program. The new implementation uses GMP* for the basic arithmetic, which
enables one to use mpfr [26] and mpc [19], thus leading to a complete program that can compute
polynomials $H_D$ on the fly, contrary to the author's implementation of ECPP, prior to version
11.0.5. This turned out to be the key for the new-born program to compete with the old one.

5.4. fastECPP. We give here the expanded algorithm corresponding to Step 1'. Using a smoothness
bound $B$, we need approximately $t = \exp(-\gamma) \log N / \log B$ values of $m$ and therefore roughly
$t/2$ discriminants. The probability that $D$ is a splitting discriminant is $g(-D)/h(-D)$. Therefore
we build discriminants until

$$\sum_{D} g(-D)/h(-D) \approx t/2.$$

One way of building these discriminants is the following: we let $r$ increase and build all or some
of the subsets of $\{q_1^*, \ldots, q_r^*\}$ until the expected number of $D$'s is reached. After this, we sort the
discriminants with respect to $(h(-D)/g(-D), h(-D), D)$ and treat them in this order.
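The construction and ordering of the discriminants can be sketched as follows. Everything here is an illustrative assumption on our part: the pool contains odd $q^*$ only, so that every negative product is a fundamental discriminant; the class number is obtained by enumerating reduced forms rather than by Louboutin's formula; and the names `class_number` and `build_discriminants` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import prod


def class_number(D):
    """h(-D) for a fundamental -D, by counting reduced forms (a, b, c)."""
    h = 0
    b = D % 2                           # b has the parity of D
    while 3 * b * b <= D:
        ac = (b * b + D) // 4           # a * c, since b^2 - 4ac = -D
        a = max(b, 1)
        while a * a <= ac:
            if ac % a == 0:
                c = ac // a
                # (a, b, c) and (a, -b, c) coincide as classes in these cases.
                h += 1 if (b == 0 or a == b or a == c) else 2
            a += 1
        b += 2
    return h


def build_discriminants(qstars, target):
    """Let r increase, add subsets containing q_r*, stop once sum g/h >= target.

    Returns the D's sorted by (h/g, h, D).
    """
    found = []
    total = Fraction(0)
    for r in range(1, len(qstars) + 1):
        new_q, older = qstars[r - 1], qstars[:r - 1]
        for k in range(len(older) + 1):
            for rest in combinations(older, k):
                value = new_q * prod(rest)
                if value < 0:
                    D = -value
                    h, g = class_number(D), 2 ** k   # k + 1 factors: g = 2^k
                    found.append((Fraction(h, g), h, D))
                    total += Fraction(g, h)
        if total >= target:
            break
    found.sort()
    return [D for _, _, D in found]


print(build_discriminants([-3, 5, -7], 100))
# [3, 7, 15, 35]
```

In this toy run all four discriminants have $h/g = 1$, so the ties are broken by $h$ and then by $D$.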



*http://www.swox.com/gmp/

6. Benchmarks



First of all, it should be noted that ECPP is not a well defined algorithm, as long as one does 
not give the list of discriminants that are used, or the principles that generate them. 

Since the first phase of ECPP requires a tree search, testing on a single number does not reveal 
too much. Averaging on more than 20 numbers is a good idea. 

Our current implementation uses GMP* for the basic arithmetic, which enables one to use mpfr
[26] and mpc [19], thus leading to a complete program that can compute polynomials $H_D$ on the
fly, contrary to the author's implementation of ECPP, prior to version 11.0.5. This turned out to
be the key for the new-born program to compete with the old one.

We give below some timings obtained with this implementation, after a lot of trials. We used as
prime candidates the first twenty primes of 1000, 1500, and 2000 decimal digits. Critical parameters
are as follows: we used $D \le 10^7$, $h \le 1000$, $\delta = 12$ (see Section 5.3.2). For 1000 and 1500 decimal
digits, we limited the largest prime factor of $h$ to be $\le 30$ and for 2000 decimal digits, it was set to 100.
This parameter has an influence in Step 3. For the extraction of small prime factors (used in the
algorithm described in [20] and denoted EXTRACT in the sequel), we used $B = 8 \cdot 10^6$, $10^7$, $3 \cdot 10^7$
for the three respective sizes.

SQRT refers to the computation of the $\sqrt{q^*}$, CORN to Cornacchia, PRP to probable primality
tests; HD is the time for computing polynomials $H_D$; jmod is
the time to solve it modulo $p$; then 1st refers to the building phase (Step 1), 2nd to the other ones;
total is the total time, check the time to verify the certificate. Some data concerning $D$, $h$ and
the size of the certificates (in kbytes) follow. All timings are cumulated CPU time on an AMD Athlon 64
3400+ running at 2.4GHz.





              min      max      avg      std
SQRT           19       34       25        3
CORN           10       24       17        4
EXTRACT        60       84       74        5
PRP            74      124      102       14
HD                       7        2        2
jmod           42       99       61       11
1st           178      276      234       27
2nd            79      136       99       12
total         260      387      334       34
check          18       22       20
nsteps        124      156      143        7
certif        396      456      435       13
D                  8740947   120639   608050
h                     1000       31       87

Table 1. 1000 decimal digits



Looking at the average total time, we see that it follows very closely the $O((\log N)^4)$ prediction.
Note also that the dominant time is that of the PRP tests, and that all phases have time close to
what was predicted.






              min      max      avg      std
SQRT          114      497      171       65
CORN            ?        ?        ?        ?
EXTRACT         ?      989        ?        ?
PRP           472        ?        ?        ?
HD              ?       13        9        2
jmod            ?      471      334       60
1st             ?        ?        ?        ?
2nd             ?        ?        ?        ?
total        1322     2240     1701      930
check          71       94       85        5
nsteps        183      209      198        7
certif        796      968      897       40
D                  9644776   201015   848112
h                      972       46      111

Table 2. 1500 decimal digits






              min      max      avg      std
SQRT          384      820      516      190
CORN          181      390      260        ?
EXTRACT       600      853      713        ?
PRP          1761     2879     2227      306
HD              6       27       16
jmod          969     1539     1255      188
1st          2974     4888     3778        ?
2nd          1398     2120     1777      991
total        4494     6795     5557      711
check         213      261      238       13
nsteps        236      262      248        7
certif       1420     1644     1539       64
D                  9760387   285217  1026529
h                     1000       63      130

Table 3. 2000 decimal digits



7. Conclusions 

We have described in greater detail the fast version of ECPP. We have demonstrated its
efficiency. As for ECPP, it is obvious that the computations can be distributed over a network of
computers. We refer the reader to [20] for more details. Note that the current record of 15071
decimal digits (with the number $4405^{2638} + 2638^{4405}$, see the NMBRTHRY mailing
list) was set using this approach. Many more numbers were proven prime using either the
monoprocessor version or the distributed one, most of them from the tables of numbers of the form
$x^y + y^x$ made by P. Leyland†.

Cheng [2] has suggested using ECPP to help his improvement of the AKS algorithm, forcing
$m = cN'$ to have $N' - 1$ divisible by a given large prime of size $O((\log N)^2)$. The same idea
can be used to speed up the Jacobi sums algorithm, and this will be detailed elsewhere.

†http://www.leyland.vispa.com/numth/primes/xyyx.htm